Heterogeneous Distributed Big Data Clustering on Sparse Grids


Similar resources

Entropy-based Consensus for Distributed Data Clustering

The ever-larger scale of available data and increasingly restrictive privacy concerns are among the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with consideration for the confidentiality of data; i.e., it is the negotiations among local cluster centers that are used in t...


Efficient Regression for Big Data Problems using Adaptive Sparse Grids

The amount of available data is increasing rapidly. This trend, often referred to as Big Data, challenges modern data mining algorithms, requiring new methods that can cope with very large, multivariate regression problems. A promising approach that can tackle non-linear, higher-dimensional problems is regression using sparse grids. Sparse grids use a multiscale system of grids with basis functions ...


Collective, Hierarchical Clustering from Distributed, Heterogeneous Data

This paper presents the Collective Hierarchical Clustering (CHC) algorithm for analyzing distributed, heterogeneous data. This algorithm first generates local cluster models and then combines them to generate the global cluster model of the data. The proposed algorithm runs in O(|S|n^2) time, with an O(|S|n) space requirement and an O(n) communication requirement, where n is the number of elements in...


k-Means for Streaming and Distributed Big Sparse Data

We provide the first streaming algorithm for computing a provable approximation to the k-means of sparse Big Data. Here, sparse Big Data is a set of n vectors in R^d, where each vector has O(1) non-zero entries, and d ≥ n. Examples include the adjacency matrix of a graph, and web-link, social-network, document-term, or image-feature matrices. Our streaming algorithm stores at most log n · k input points in memo...


Distributed Application Management in Heterogeneous Grids

Distributing an application across several machines is one of the key aspects of Grid computing. In the last few years, several groups have developed solutions for the communication problems that arise. However, users are still left on their own when it comes to handling a Grid computer as soon as they face a mix of several Grid software environments on the target machines. This paper presen...



Journal

Journal title: Algorithms

Year: 2019

ISSN: 1999-4893

DOI: 10.3390/a12030060